Hybrid Machine Learning-Based Approach for Anomaly Detection using Apache Spark

Abstract

Over the past few decades, the volume of data has increased significantly in both scientific institutions and universities, with a large number of students enrolled and a high volume of related data. Furthermore, network traffic has increased post-pandemic with the use of online learning. Therefore, processing network traffic is a complex and challenging task that increases the possibility of intrusions and anomalies. Traditional security systems cannot deal with such high-speed big data traffic. Real-time anomaly detection should be able to process data as quickly as possible to detect abnormal and malicious traffic. This paper proposes a hybrid approach consisting of supervised and unsupervised learning for anomaly detection based on the big data engine Apache Spark. Initially, the k-means algorithm was implemented in Spark's MLlib for clustering network traffic; then, for each cluster, the K-nearest neighbors (KNN) algorithm was implemented for classification and anomaly detection. The proposed model was trained and validated against a real dataset from Ibn Zohr University. The results indicate that the proposed model outperformed other well-known algorithms in detecting anomalies on the aforementioned dataset. The experimental results show that the proposed hybrid approach can reach up to 99.94% accuracy using the k-fold cross-validation method on the complete dataset with all 48 features.
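
The two-stage idea in the abstract (cluster the traffic with k-means, then classify within each cluster using KNN) can be illustrated with a minimal sketch. This is plain Python rather than Spark MLlib, and it is not the authors' implementation: the function names (`kmeans`, `fit_hybrid`, `predict`), the evenly spaced centroid initialization, the choice of 3 neighbors, and the two-feature toy records are all assumptions made for illustration. The real dataset has 48 features and is not reproduced here.

```python
import math
from collections import Counter

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))

def kmeans(points, k, iterations=20):
    """Basic Lloyd's k-means. Initial centroids are taken at evenly
    spaced positions in the input (an assumption, chosen for determinism)."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def fit_hybrid(points, labels, k):
    """Stage 1: cluster without labels. Stage 2: store the labelled
    members of each cluster so KNN can vote inside it."""
    centroids = kmeans(points, k)
    clusters = [[] for _ in range(k)]
    for p, y in zip(points, labels):
        clusters[nearest(p, centroids)].append((p, y))
    return centroids, clusters

def predict(model, point, n_neighbors=3):
    """Route the point to its nearest cluster, then take a majority
    vote among its nearest labelled neighbors in that cluster."""
    centroids, clusters = model
    members = clusters[nearest(point, centroids)]
    if not members:  # fall back to all training records
        members = [m for c in clusters for m in c]
    closest = sorted(members, key=lambda m: dist(point, m[0]))[:n_neighbors]
    return Counter(y for _, y in closest).most_common(1)[0][0]

# Hypothetical two-feature records, invented for this sketch.
POINTS = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),   # normal
    (2.5, 2.6), (2.6, 2.4), (2.4, 2.5),   # anomaly
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.1),   # normal
    (9.5, 6.0), (9.6, 6.1), (9.4, 5.9),   # anomaly
]
LABELS = ["normal"] * 3 + ["anomaly"] * 3 + ["normal"] * 3 + ["anomaly"] * 3

model = fit_hybrid(POINTS, LABELS, k=2)
for sample in [(1.1, 1.0), (2.5, 2.5), (8.1, 8.0), (9.5, 6.1)]:
    print(sample, predict(model, sample))
```

Restricting the KNN search to one cluster means each query is compared against only a fraction of the training records, which is one plausible reason for pairing the two algorithms on high-volume traffic.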

Similar resources

MLlib: Machine Learning in Apache Spark

Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark’s open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shi...

Benchmarking Apache Spark with Machine Learning Applications

We benchmarked Apache Spark with a popular parallel machine learning training application, Distributed Stochastic Gradient Descent for Matrix Factorization [5] and compared the Spark implementation with alternative approaches for communicating model parameters, such as scheduled pipelining using POSIX socket or MPI, and distributed shared memory (e.g. parameter server [13]). We found that Spark...

A hybrid machine learning approach to network anomaly detection

Zero-day cyber attacks such as worms and spy-ware are becoming increasingly widespread and dangerous. The existing signature-based intrusion detection mechanisms are often not sufficient in detecting these types of attacks. As a result, anomaly intrusion detection methods have been developed to cope with such attacks. Among the variety of anomaly detection approaches, the Support Vector Machine...

Machine Learning for Host-based Anomaly Detection

Machine Learning for Host-based Anomaly Detection by Gaurav Tandon Dissertation Advisor: Philip K. Chan, Ph.D. Anomaly detection techniques complement signature based methods for intrusion detection. Machine learning approaches are applied to anomaly detection for automated learning and detection. Traditional host-based anomaly detectors model system call sequences to detect novel attacks. This...

A Hybrid Machine Learning Method for Intrusion Detection

Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Already various techniques of artificial intelligence have been used for intrusion detection. The main challenge in this area is the running speed of the available implemen...

Journal

Journal title: International Journal of Advanced Computer Science and Applications

Year: 2023

ISSN: 2158-107X, 2156-5570

DOI: https://doi.org/10.14569/ijacsa.2023.0140496